Training a Sentence-Level Machine Translation Confidence Measure
Abstract
We present a supervised method for training a sentence-level confidence measure on translation output using a human-annotated corpus. We evaluate a variety of machine learning methods. The resulting measure, although trained on a very small dataset, correlates well with human judgments and proves effective in one task-based evaluation. Although the experiments have been run on only one MT system, we believe the features gathered are general enough that the approach will also work well on other systems.
Similar Resources
Confidence Measure for Word Alignment
In this paper we present a confidence measure for word alignment based on the posterior probability of alignment links. We introduce a sentence alignment confidence measure and an alignment link confidence measure. Based on these measures, we improve alignment quality by selecting high-confidence sentence alignments and alignment links from multiple word alignments of the same sentence pair. Add...
Efficient combination of confidence measures for machine translation
In this paper we present a twofold contribution to confidence measures for machine translation. First, in order to train and test confidence measures, we present a method to automatically build corpora containing realistic errors. Errors introduced into reference translations simulate classical machine translation errors (word deletion and word substitution), and are supervised by WordNet. Second...
Goodness: A Method for Measuring Machine Translation Confidence
State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely translated correctly and how confident they are about the whole sentence. We propose a novel framework to predict word-level and sentence-level MT errors with ...
The Regression Model of Machine Translation
Machine translation is the task of automatically finding the translation of a source sentence in the target language. Statistical machine translation (SMT) uses parallel corpora, or bilingual paired corpora known to be translations of each other, to find a likely translation for a given source sentence based on the observed translations. The task of machine translation can be seen as an insta...
Extracting a Parallel Corpus from Comparable Documents to Improve Translation Quality in Machine Translation Systems
Data used for training statistical machine translation methods are usually prepared from three resources: parallel, non-parallel, and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most existing methods for exploiting comparable corpora loo...